Go-Ahead: Improving Prior Knowledge Heuristics by Using Information Retrieved From Play Out Simulations
Authors
Abstract
This paper introduces Go-Ahead, an automatic Go player that uses a new technique to improve the accuracy of the pre-estimated values of the moves that are candidates to be introduced into the classical Monte Carlo tree search (MCTS) algorithm, which is used by many of the current top Go agents. Go-Ahead is built on the framework of one of these agents, the well-known open-source player Fuego, in which these pre-estimated values are obtained by means of a heuristic called prior knowledge. Go-Ahead refines the calculation of these values through a technique that performs a balanced combination of the prior knowledge heuristic and relevant information retrieved from the numerous play-out simulation phases that are executed repeatedly throughout the Monte Carlo search. With this strategy, Go-Ahead improves how MCTS chooses appropriate moves. The approach also reduces the level of supervision inherent in that process, because it lessens the impact of the prior knowledge heuristic by strengthening the impact of play-out information. The results obtained in tournaments against Fuego confirm the benefits of this approach.
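The abstract does not state the formula Go-Ahead uses, so the following is only a minimal sketch of what a "balanced combination" of a prior estimate and play-out statistics could look like. It assumes a simple weighted mean in which the prior counts as a number of virtual visits. The names `MoveStats`, `blended_value`, and `prior_weight` are hypothetical and do not come from the paper or from Fuego.

```python
from dataclasses import dataclass


@dataclass
class MoveStats:
    """Statistics for one candidate move (hypothetical structure)."""
    prior_value: float    # heuristic estimate in [0, 1] from prior knowledge
    playout_wins: int     # play-outs through this move that ended in a win
    playout_visits: int   # total play-outs through this move


def blended_value(stats: MoveStats, prior_weight: float = 10.0) -> float:
    """Weighted mean of the prior estimate and the observed play-out win rate.

    Assumption: prior_weight is the number of virtual visits credited to the
    prior. A smaller prior_weight lets play-out information dominate sooner.
    """
    total = prior_weight + stats.playout_visits
    if total == 0:
        # No prior weight and no play-outs: fall back to the prior estimate.
        return stats.prior_value
    return (prior_weight * stats.prior_value + stats.playout_wins) / total


if __name__ == "__main__":
    fresh = MoveStats(prior_value=0.5, playout_wins=0, playout_visits=0)
    tested = MoveStats(prior_value=0.5, playout_wins=10, playout_visits=10)
    print(blended_value(fresh))   # only the prior contributes
    print(blended_value(tested))  # play-outs pull the estimate upward
```

In this sketch, lowering `prior_weight` corresponds to the idea in the abstract of lessening the impact of prior knowledge while strengthening the impact of play-out information.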
Similar resources
BTT-Go: An Agent for Go that Uses a Transposition Table to Reduce the Simulations and the Supervision in the Monte-Carlo Tree Search
This paper presents BTT-Go: an agent for Go whose architecture is based on the well-known agent Fuego, that is, its search process for the best move is based on simulations of games performed by means of Monte-Carlo Tree Search (MCTS). In Fuego, these simulations are guided by supervised heuristics called prior knowledge and play-out policy. In this context, the goal behind the BTT-Go proposal i...
Achieving Master Level Play in 9 × 9 Computer Go
The UCT algorithm uses Monte-Carlo simulation to estimate the value of states in a search tree from the current state. However, the first time a state is encountered, UCT has no knowledge, and is unable to generalise from previous experience. We describe two extensions that address these weaknesses. Our first algorithm, heuristic UCT, incorporates prior knowledge in the form of a value function...
Tropospheric aerosol profile information from high-resolution oxygen A-band measurements from space
Aerosols are an important factor in the Earth climatic system and they play a key role in air quality and public health. Observations of the oxygen A-band at 760 nm can provide information on the vertical distribution of aerosols from passive satellite sensors that can be of great interest for operational monitoring applications with high spatial coverage if the aerosol information is obtained ...
Heuristics in Monte Carlo Go
Writing programs to play the classical Asian game of Go is considered one of the grand challenges of artificial intelligence. Traditional game tree search methods have failed to conquer Go because the search space is so vast and because static evaluation of board positions is extremely difficult. There has been considerable progress recently in using Monte Carlo sampling to select moves. This p...
All-Moves-As-First Heuristics in Monte-Carlo Go
We present and explore the effectiveness of several variations on the All-Moves-As-First (AMAF) heuristic in Monte-Carlo Go. Our results show that: • Random play-outs provide more information about the goodness of moves made earlier in the play-out. • AMAF updates are not just a way to quickly initialize counts, they are useful after every play-out. • Updates even more aggressive than AMAF can ...
Journal title:
Volume, issue:
Pages: -
Publication date: 2016